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Abstract 



In the present work, we investigate the computational efficiency afforded by higher- 
order finite-element discretization of the saddle-point formulation of orbital-free density- 
■ functional theory. We first investigate the robustness of viable solution schemes by ana- 

lyzing the solvability conditions of the discrete problem. We find that a staggered solution 
p I procedure where the potential fields are computed consistently for every trial electron- 

density is a robust solution procedure for higher-order finite-element discretizations. We 
next study the numerical convergence rates for various orders of finite-element approx- 
imations on benchmark problems. We obtain close to optimal convergence rates in our 
studies, although orbital- free density- functional theory is nonlinear in nature and some 
benchmark problems have Coulomb singular potential fields. We finally investigate the 
computational efficiency of various higher-order finite-element discretizations by mea- 
suring the CPU time for the solution of discrete equations on benchmark problems that 
include large Aluminum clusters. In these studies, we use mesh coarse-graining rates that 
are derived from error estimates and an a priori knowledge of the asymptotic solution of 
the far-field electronic fields. Our studies reveal a significant 100-1000 fold computational 
savings afforded by the use of higher-order finite-element discretization, alongside provid- 
ing the desired chemical accuracy. We consider this study as a step towards developing 
a robust and computationally efficient discretization of electronic structure calculations 
using the finite-element basis. 



1 Introduction 

Electronic structure calculations have been successful in accurately predicting a wide 
range of material properties over the course of past few decades. These predictions 
range from the accurate description of phase transformations in various materials to the 
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electronic and optical properties of nanostructures. The predictive capability of elec- 
tronic structure calculations can be attributed to the fact that they are derived from 
many-body quantum-mechanics and incorporate much of the fundamental physics with 
little empiricism. One of the most popular and widely used electronic structure the- 
ories is the Kohn-Sham approach to density functional theory (DFT) [1]. It is based 
on the Hohenberg-Kohn theorem [2] which states that there is a one-to-one correspon- 
dence between the ground-state electronic wave-function of a quantum-mechanical system 
with N interacting electrons and the ground-state electron-density. Thus, based on the 
Hohenberg-Kohn result, the ground-state properties of any quantum-mechanical system 
can be described by an energy functional of the electron-density. While the existence of 
such an energy functional to describe the ground-state properties is known, its functional 
representation is unknown to date. The Kohn-Sham approach circumvents this challenge 
by considering an equivalent system of non-interacting electrons moving in an effective 
mean field governed by the electron-density. While, in principle, this non-interacting inde- 
pendent particle description is exact for ground-state properties, it is formulated in terms 
of an unknown exchange-correlation functional of the electron-density for which various 
approximate models are used. Further, the Kohn-Sham approach is computationally ex- 
pensive for large scale simulations (thousands of electrons or more), since it requires an 
expensive evaluation of the kinetic energy functional that involves the computation of the 
N independent single-electron wave-functions self-consistently, which scales as O(iV^). In 
order to reduce the computational complexity of electronic structure calculations using 
DFT, approximate models for the kinetic energy density functionals have been proposed 
and used in electronic structure calculations (cf. e.g., [31 HI O [H [7] ) . This approach is 
referred to as orbital-free DFT where the ground-state energy is explicitly described as a 
functional of the electron-density. Various numerical studies have shown that the orbital- 
free approximation to DFT is reasonable for systems with electronic structure close to a 
free-electron gas (cf. e.g. [6l[71[8]), for e. g. simple metals and Aluminum, but provides an 
inaccurate description for covalently bonded and ionic systems. Irrespective of whether 
one is using the Kohn-Sham or orbital-free approach, realistic simulations of large-scale 
material systems with DFT are still very demanding, and numerical algorithms which 
are robust, computationally efficient and scalable on parallel computing architectures are 
always desirable to enable simulations at larger scales and on more complex systems. 

Plane-wave basis has been the most popular basis set used in DFT calculations to 
date as it renders an efficient computation of electrostatic interactions naturally through 
Fourier transforms [HI [IHl Ell [E] • However, a plane- wave basis often restricts the geom- 
etry of simulation domains to periodic systems, which is incompatible with the displace- 
ment fields produced by most crystalline defects. Further, a plane-wave basis provides a 
uniform spatial resolution which is computationally inefficient in the study of defects and 
non-periodic systems like molecules and clusters — a higher resolution is often desired to 
describe specific regions of interest, such as a defect-core, but a coarser resolution suf- 
fices elsewhere. Moreover, the non-locality of a plane-wave basis affects the scalability of 
computations on massive parallel computing architectures. Thus, there is an increasing 
thrust towards using real-space techniques for electronic structure calculations (cf. |13| 
and references therein for a comprehensive overview). Among the real-space techniques, 
many efforts have focused on developing the finite-element basis for electronic structure 
calculations as it can accommodate unstructured coarse-graining of the basis set, can 
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handle complex geometries and boundary conditions, and is easy to parallelize. We re- 
fer to [la [TSl [HI [11 [II EOl Ell Ea [Ml ESI and references therein for 
a comprehensive overview. The convergence of the finite-element approximation using 
a variational notion of F— convergence has been shown for the case of orbital-free DFT 
(with Thomas-Fermi- von Weizsacker family of kinetic energy functionals) [22] as well as 
for Kohn-Sham DFT |25j . Theoretical rates of convergence for the finite-element approx- 
imation of orbital-free DFT problem with pseudopotentials have been recently estimated 
in [31] . We also refer to |321 [33] for a numerical analysis of a class of nonlinear eigenvalue 
problems, which includes the orbital- free DFT problem. 

While the finite-element basis is more versatile in comparison to a plane-wave ba- 
sis [19], prior studies have shown that the number of basis functions in a linear finite- 
element discretization required to achieve chemical accuracy in electronic structure cal- 
culations is very large (cf. e. g. [27l[29]), thus increasing the computational cost of calcu- 
lations. The use of higher-order finite-element discretizations was suggested as a possible 
solution to alleviate this degree of freedom disadvantage of linear finite-elements in elec- 
tronic structure calculations — cf. [30] in the context of spectral elements and [28] in the 
context of hierarchical higher-order finite-element discretizations. However, using higher- 
order finite-element discretizations also increase the per basis function computational cost 
due to the use of higher-order numerical quadrature rules and an increase in the band- 
width of the matrix which affects the efficiency of the iterative solution schemes. Further, 
the rates of convergence for higher-order finite-element approximations of electronic struc- 
ture theories, while numerically studied [l9l[3l] and mathematically analyzed [HI [321 [33] 
in the context of pseudopotential calculations, are not completely understood in the case 
of all-electron calculations. Thus, in the present work, we conduct a systematic study to 
investigate the viability and the computational efficiency afforded by higher-order finite- 
element approximations in electronic structure calculations using orbital-free DFT as a 
model theory. We consider this investigation as an important step towards our final goal 
of studying the effectiveness of higher-order finite-element approximations in Kohn-Sham 
DFT. 

A robust solution procedure and suitably graded finite-element meshes are key to 
examining the computational efficiency of higher-order finite-element discretizations in 
orbital-free DFT. To this end, we first investigate the robustness of viable solution 
schemes by analyzing the solvability conditions of the discrete orbital-free DFT problem 
and subsequently propose an a priori mesh adaption scheme to construct suitably graded 
finite-element meshes that will be used in the numerical investigations. We begin with 
the local real-space formulation as presented in [22[ [35] , where the problem of computing 
the ground-state energy is reformulated as a local saddle-point problem in the electronic 
fields comprising of the electron-density, electrostatic potential, and the kernel potentials 
describing the extended interactions in kinetic energy functionals. The discrete non- linear 
equations resulting from the finite-element discretization of the saddle-point problem can 
be solved using two different approaches: (i) a simultaneous solution of all fields using 
quasi-Newton methods; (ii) a staggered approach where the electron-density is solved 
using quasi-Newton methods, whereas the potential fields are computed consistently for 
every trial electron-density using iterative linear solvers. We investigate the robustness 
of these approaches by analyzing the solvability conditions for the linearized incremen- 
tal problem corresponding to a single step within the iterative scheme of the nonlinear 
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problem. We consider the scheme to be well-behaved if each iterative update is uniquely 
solvable. While the staggered approach guarantees solvability within the vicinity of the 
exact solution and is robust for any order of discretization of electronic fields (argument 
based on Proposition 3.3 of |31j). the same is not true with a simultaneous solution proce- 
dure. The discrete equations in simultaneous scheme are not solvable if the dimension of 
the vector-space approximating electron-density far exceeds that of electrostatic potential 
and the Hessian matrix corresponding to electron-density is rank deficient to a certain 
degree which depends on the initial guess. Having examined the best solution procedure 
to solve the discrete equations (staggered approach), we next turn towards determining 
an optimal finite-element mesh for the orbital-free DFT problem. To this end, we derive 
error estimates in the energy as a function of the characteristic mesh-size distribution 
/i(x) and the electronic fields comprising of the electron-density, electrostatic potential 
and kernel potentials. We remark that error estimates have been rigorously derived in 
recent studies for the orbital- free DFT problem |31l [32} 133] . though the form of these 
estimates is not useful for developing mesh adaption schemes as these studies primarily 
focused on proving the convergence of the finite-element approximation and determin- 
ing the convergence rates. We use the derived error estimates in conjunction with the 
mesh adaption scheme proposed in [36] to determine the optimal coarse-graining rates 
for the finite-element discretizations in terms of the degree of the interpolating polyno- 
mial and the exact solution fields in the orbital-free DFT problem. As the exact solution 
fields are a priori unknown, we use the asymptotic solutions of the electronic fields away 
from the nuclei, which are a priori known, to determine the coarse-graining rates for the 
finite-element discretizations that will subsequently be used in the numerical studies. We 
note that the finite-element meshes thus determined are optimal away from nuclei, but 
not necessarily optimal in the vicinity of the nuclei. We remark that obtaining optimal 
meshes away from the nuclei is still very useful for efficiently resolving the vacuum in 
non-periodic calculations. 

We next turn towards the numerical investigation of the saddle-point orbital-free 
DFT problem in the finite-element basis. The various finite-elements used in our study 
include tetrahedral elements up to second-order, hexahedral elements up to fourth-order 
as well as spectral finite-elements [37] of order three and four. We begin our study by 
computing the numerical rates of convergence for all the elements. These convergence 
studies are conducted on three benchmark problems comprising of: (i) Hydrogen atom, 
which represents a linear problem with a singular potential; (ii) Helium atom, which 
represents a non-linear problem with a singular potential; (iii) Aluminum atom, which 
represents a non-linear problem with a pseudopotential. In these studies, as well as those 
to follow, we use the meshes obtained from the proposed a priori mesh adaption scheme. 
These benchmark studies show rates of convergence in energy of 0{h?^) for a finite- 
element whose degree of interpolation is k, which denote optimal rates of convergence 
determined by the error analysis of the orbital- free DFT problem |3H [321 [33] . We remark 
that these numerical studies show optimal rates of convergence even for singular Coulomb 
potentials (Hydrogen and Helium atom) which were not analyzed in [311 [32l [33] . To the 
best our knowledge, an error analysis of orbital-free DFT considering singular Coulomb 
potentials is an open question to date. We next conduct a systematic study to assess the 
computational efficiency afforded by higher-order finite-element discretizations. To this 
end, we use the same benchmark problems and measure the CPU time for the solution 
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of the orbital-free DFT problem to various relative accuracies for all the finite-elements 
considered in this study. We find that higher-order elements are computationally more 
efficient, and can provide significant computational savings, especially in the regime of 
chemical accuracy. For instance, a staggering 100-1000 fold computational savings are 
realized with the use of higher-order elements resulting in a relative energy error of 
10~^ in comparison to linear finite-elements with a much higher relative error of around 
10~^. We also observe a 2-3 fold computational savings in using spectral finite-elements 
over conventional finite-elements of the same order. To further assess the performance of 
higher-order elements, we conducted simulations on a series of aluminum clusters ranging 
from 1 face-centered-cubic (fee) unit cell to 5 x 5 x 5 unit cells, and these studies also 
reflect a similar computational advantage afforded by the use of higher-order elements. 

The remainder of the paper is organized as follows. Section [2] describes the saddle- 
point formulation of the orbital-free DFT problem and summarizes the key results from 
the analysis of solvability conditions of the linearized system of equations corresponding to 
the discrete orbital-free DFT problem, whose detailed analysis is provided in appendix lAl 
Section [3] presents the error estimates for the finite-element discretization of orbital-free 
DFT and uses these estimates to present an a priori mesh adaption scheme. Sections H] 
and [5] describe the numerical implementation of the method and present the numerical 
examples which demonstrate the computational efficiency afforded by higher-order finite- 
element discretizations in electronic structure calculations. We conclude with a short 
discussion in Section [6l 

2 Formulation 

In this section, we present the variational formulation underlying the orbital-free DFT 
problem and a discretization of the formulation using a finite-element basis. We also 
discuss the main results from the analysis of solvability conditions corresponding to the 
discrete orbital-free DFT problem, whose detailed analysis is provided in appendix |Al If 
and M denote the number of electrons and the number of atoms in a charge neutral 
system respectively, and u = ^fp denotes the square-root electron-density such that 
/ dx = A'^, then the energy of the system described by density functional theory is 
given by (cf. e. g. [SHU]) 

E{u,K) = Ts{u) + E^^{u) + Eh{u) + EeM.R) + E^R). (1) 

where R = {Ri, • • • , Rm} is the position vector denoting the nuclear coordinates in the 
system; Tg is the kinetic energy of the non-interacting electrons; Exc is the exchange cor- 
relation energy; Eh denotes the classical electrostatic interaction energy of the electron- 
density also referred to as Hartree energy; Ef^^t is the interaction energy with the external 
field, Vext, induced by nuclear charges and Ezz is the repulsive energy between nuclei. 
In orbital-free DFT, Tg is modeled using explicit functional forms of electron-density, 
and we employ, in our study, the Thomas- Fermi- von Weizscaker(TFW) family of func- 
tionals [3], as well as, the more accurate Thomas-Fermi- von Weizscaker functional with 
kernel energies OE]. In what follows, we present the variational formulations for each 
of these cases. 
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2.1 The Variational Problem 

The TFW family of functionals have the following representation 

Ts{u) = Cf( ^/i°/='(x)dx + ^ / |V^x(x)pdx, (2) 

where Cp = ^(37r2)2/3 and A is a parameter. Though different values of A are found to 
work better in different cases, A = ^ is chosen in this study since this was found to be 
an optimal choice for various atomic systems [3]. E^c, the exchange correlation energy 
in functional ([T]), is modeled using the local density approximation (LD A) [381 139] and is 
given by 

E^ciu) = / exc(^i(x))ti^(x) dx, (3) 
where Sxc = + £c is the exchange and correlation energy per electron given by 

'4 (^) 



r, > 1 



I ^4 log + i? + Cr^ log + Dr^ < 1, 

where = ij^)^^^- The values of constants used in this study are those of an un- 
polarized medium, and are given by 7^ = -0.1471, Piu = 1.1581, f32u = 0.3446, Au = 
0.0311, Bu = -0.048, Cu = 0.0014, L»„ = -0.0108. The electrostatic interactions in the 
functional ([TJ have the following form: 

Eh{u) = - / ^^^dxdx', (6) 
^ yiR3 Je3 |x-x'| 

Eext{u,R)= f n2(x)yext(x)dx = V / n^(x) \| Q?x, (7) 

MM 

The electrostatic interaction terms as expressed in equations ([S])-([5]) are nonlocal in real- 
space, and, for this reason, evaluation of electrostatic energy is the computationally 
expensive part of the calculation. Following the approach in |22j . the electrostatic inter- 
action energy can be reformulated as a local variational problem in electrostatic potential 
by observing that ^ is the Green's function of the Laplace operator. To this end, we 
consider a nuclear charge Zj located at Rj as a bounded regularized charge distribution 
Zj6iij{x) having a support in a small ball around Rj and a total charge Zj. The nuclear 
repulsion energy can subsequently be represented as 

EUR) = 11 I 'P^d.d.', (9) 

^ ./lB>3 ./in.3 X — X 
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where 6(x) = "^^=1 ^i^Ri(^)- remark that, while this differs from the expression 
in equation ([8]) by the self-energy of the nuclei, the self-energy is an inconsequential 
constant depending only on the nuclear charge distribution, and is explicitly evaluated 
and subtracted from the total energy in numerical computations. Subsequently, the 
electrostatic interaction energy, up to a constant self-energy, is given by the following 
variational problem: 

1 f f n2(x)u2(x') ^ . r 2, r ^ f f fe(x)6(x') ^ ^ , 

2 Jrs |x - X I J^a 2 J^a J^s |x - x | 



inf lJ-/" |V<A(x)|2<ix- / (n2(x) + 6(xMx)dx|, 



(10) 



where (pix) denotes the trial function for the total electrostatic potential due to the 
electron-density and the nuclear charge distribution and 3^ is a suitable function space 
that guarantees existence of a minimizer. Using equation (|lUp . the energy functional ([T]) 
is reformulated in a local form given by 



E{u,R) = sup L{u,(p,R), (11) 
<pey 



where 



L{u,4),R) = Cf [ u^^/^{x)dx+- [ \Vu{x)\'^dx+ [ exc{u^{x))u^{x)dx 
[ \V4>{^)\'dx+ [ {u\x)+b{x))4>{x)dx. 

OTT Jk3 Jk3 



(12) 



Finally, the problem of determining the ground-state energy and electron-density for 
given positions of the nuclei can be expressed as the saddle-point problem 

inf sup L(n, (/>, -R) subject to: / u'^{x)dx = N, (13) 

where X denotes a suitable function space that guarantees the existence of minimiz- 
ers. We remark that numerical computations are done on a finite-domain, i7, which in 
non-periodic calculations corresponds to a large enough domain containing the compact 
support of u for all practical purposes, and which in periodic calculations corresponds 
to the domain defining the unit cell. X = y = Hq[Q) for non-periodic problems and 
X = Hp^j.{Q,) and y = {(j) : (j) £ Hp^^{Q,), /(^ = 0} ^ov periodic problems guarantees exis- 
tence of solutions for the saddle-point in (jlSp . and refer to |22j for further details. We also 
refer to [40] for a comprehensive mathematical analysis of various fiavors of orbital-free 
DFT models on unbounded domains. 

We now turn towards the variational formulation of the orbital-free DFT model with 
kernel energies. The kinetic energy of non-interacting electrons, Ts(u), modeled using 
kernel energies has the following representation 

Ts{u) = Cf [ u^'>/^{x)dx+l [ \Vu{x)\^dx + Tk{u), (14) 



7 



Motamarri, Iyer, Knap, & Gavini 



where is the non-local kernel energy and is given by 

Tk{u)= [ [ n2"(x)i^(|x-x'|,n(x),'u(x'))n2^(x')dxdx'. (15) 

Jr3 J^3 

The kernel K is chosen such that Tg satisfies the Lindhard susceptibility function for an 
appropriate choice of the parameters a, (3 [QE]. Further, these kernels are referred to as 
density dependent (DD) or density independent (DI) kernels based on the dependence or 
independence of kernel K on u. It is a common practice to decompose the DD kernels 
into a series of DI kernels through a Taylor expansion about a reference density [TJ HI] 
and a similar approach is followed in this work. 

Similar to the electrostatic interaction energy, the kernel energy has interactions that 
are extended in real space. The local reformulation of non-local kernel energies into a 
saddle point problem has been addressed in [35], and the formulation is provided here for 
the sake of completeness. Defining the potentials Kt(x) and V/3(x) as 

K.(x)= / A'(|x-x'|)n2"(x')dx', (16a) 




(16b) 



we take the fourier transform to obtain Va{k) = K{k)u^°'{k) and ya(fc) = K{k)v?'l^{k). 
As shown in [H], K can approximated to a very good accuracy using a sum of partial 
fractions of the following form: 

j=i ' ' 

where Aj, Bj, J = 1- ■ ■ m are complex constants determined using a best fit approxima- 
tion. Using this approximation, the potentials defined in (jl6p take the form 

m 

F„(x) = J][a;-^(x)+AV-(x)] , 
J=i 

m 

j=i 

where aj"-^(x) and uj^-'{'x) are referred to as kernel potentials and are solutions to the 
following Helmholtz equations for J = 1 . . .m: 

-V^LO^J + Bjuj'^-' + AjBjv?'' = , (19a) 
-V'^uj^-' + Bjoj^-' + AjBju^l^ = . (19b) 

In the notation introduced in the above equation, we note that aj in w"-^ is only a super- 
script, and should not be interpreted as the power of u (and likewise for oo^-'). For con- 
venience of notation, we define oj°^ = {oj°^^ , cj'^^ , • • • , w"™ } and CjI^ = {uj^^ , u^'^ , • • • , w'^™ } , 
which denote the vectors containing the corresponding kernel potentials. The Helmholtz 



(18a) 
(18b) 
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equations in equation (jl9|) can be expressed in a variational form which allows us to 
reformulate the non-local energies in (llSp as the following local saddle-point problem: 



where 



Tk{u)= inf sup L(n,a}°,a}^), (20) 



m „ 



— lAj Bj Aj ^21) 



In the saddle-point problem in equation (j20p . the function space is chosen to be ^ = 
(i?pg^(f2))™' for periodic problems where i? denotes the unit-cell in the periodic calcula- 
tion, and we do not treat non-periodic problems with kernel functionals. Using the local 
reformulation of the electrostatic interactions and kernel energies from equations (]10p 
and (|20p . the problem of computing the ground-state properties in orbital- free DFT, 
for a fixed position of atoms, can now be expressed as the following local saddle-point 
problem in real-space on bounded domains as: 

inf sup inf sup L(u,(j),Cj°' ,u!'^ , R) subject to: / dx = A^, (22) 

where 

L{u, 0, Cj'',Cj^,R) = L{u, 0, R) + L{u, Cj'^,^^). (23) 

The local variational formulation of the ground-state electronic-structure described above 
forms the basis of the finite-element approximation schemes discussed subsequently. 

2.2 The Discrete Problem: 

Let X"^ with dimension n^, xf^ with dimension n^, Xf^ with dimension denote the 
finite-dimensional subspaces oi X , y and Z respectively. The discrete problem corre- 
sponding to (|22p is given by the following constrained saddle-point problem: 



inf sup inf sup L{uh,<t)h,^^hi^h^ R) 

u^^Xl ^^^xt ^H^X- ^P^^x^ 



(24) 



subject to: / tt| dx = N. 
Jn 

Similarly, the discrete problem corresponding to (jl3p is given by 

inf sup L{uh,(t>h,R) subject to / u\d'x. = N. (25) 

The convergence of finite-element approximation for the orbital-free DFT model with 
TFW family of kinetic energy functionals was shown in |22j using the notion of F— 
convergence. Recent numerical analysis of the finite-element discretization also estimated 
the rates of convergence of the approximation, and we refer to |3H [32] for further details. 
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The higher-order finite element discretization of the saddle-point orbital-free DFT 
problem leads to a natural question: what is an efficient approach to solve the resulting 
nonlinear mixed- field finite-element equations in (|24p and (|25p ? Two possible approaches 
to solve the problem are considered: (i) a simultaneous solution of all the fields using 
quasi-Newton methods; (ii) a staggered approach where the electron-density is solved 
using quasi-Newton methods, whereas the potential fields are computed consistently for 
every trial electron-density using iterative linear solvers. Appendix |A] derives the solv- 
ability conditions of the linearized discrete finite-element equations arising out of (I25p . 
which provides insights into the robustness of these two approaches. The analysis of 
these solvability conditions indicates that, while the discrete system of finite-element 
equations in the staggered approach is solvable for any choice of interpolation spaces of 
the electronic fields, the simultaneous scheme imposes restrictions on the choice of the 
interpolation spaces of the fields involved. The necessary condition for the discrete sys- 
tem of finite-element equations to be solvable in a simultaneous scheme is that the rank 
of the Hessian matrix corresponding to root-electron-density (A) must be greater than 
or equal to n„ — (n,^ -|- 1) which depends on the initial guess of root-electron-density and 
electrostatic potential. Furthermore, if the finite element discretization is chosen such 
that rank(A) > n„ — (n^ + l) is always satisfied, the sufficiency condition for invertibility 
of the system matrix is given by (i) in Proposition lA.ll which is difficult to check for any 
general discretization and guess of electronic fields. Similar arguments can be extended to 
analyze the solvability conditions of the discrete finite-element problem involving kernel 
potentials in ()24p . This analysis shows that the staggered solution procedure is a more 
robust approach to solve the mixed-field discrete finite-element equations correspond- 
ing to the orbital-free DFT problem and will be employed in the subsequent numerical 
investigations. 

Next we derive the optimal coarse-graining rates for the finite-element meshes using 
the asymptotic nature of the solution fields in the orbital-free DFT problem. 

3 A priori Mesh Adaption 

Following [36] . we present an a priori mesh adaption scheme by minimizing the error in 
the finite-element approximation of the orbital-free DFT problem for a fixed number of 
elements in the mesh. To this end, we first seek a bound on the energy error \E — Eh\ 
as a function of the characteristic mesh-size, h, and the distribution of electronic fields. 
We remark that error estimates have been rigorously derived in recent studies for the 
orbital-free DFT problem [311 1321 [33] . However, the form of these estimates is not useful 
for developing mesh adaption schemes as these studies primarily focused on proving 
the convergence of the finite-element approximation and determining the convergence 
rates. In what follows, we present the derivation of error bound in terms of the root- 
electron-density and the electrostatic potential for the orbital-free DFT problem with 
TFW kinetic energy functionals. Similar error estimates for orbital- free DFT models 
with kernel energies are discussed in appendix [Bl 
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3.1 Estimate of Energy Error 

Let {uh, cph, fih) and (u, (f), p.) be the solutions of the discrete finite-element problem 
(|25p and the continuous problem (jlSh respectively for a given set of nuclear positions, 
where the nuclear charges are represented by a bounded regularized charge distribution 
6(x). The ground state energy in the discrete and the continuous formulations can be 
expressed as 



Eh{uh, 

E{u, 
where 



A 



.V^rdx + / F{uh)dx 
- [ iVupdx + I F{u)dx - 



^ [ |V4pa!x+ [ {ul + b)^hdx, 

OTT Jn Jo 

[ |V(^|2(ix+ / {U^ + b)^dx, 

^5vr Jn Jn 



.2\„,2 



Proposition 3.1. In the neighborhood of (u, (p, fi), the finite- element approximation 
error in the ground state energy can be bounded as: 



\Ey, -E\<- 



\V6u\'^dx 



n 
1 

+ 8^ 



i6uf dx 



n 



+ 



F"{u){5ufd^ 



n 



[ \V5^\'^dx 


+ 


/ {5uf^dx 


+ 2 


/ u 5u dcj) dx. 


Jn 




Jn 




Jn 



(26) 



Proof. We begin by expanding Eh{uh, (ph) about the solution of the continuous problem, 
i.e Uh = u + 5u and 4>h = cj) + 6(j)- Using Taylor series expansion, we get 



Eh{u -\-6u,(l) + 



:- [ \V{u + 5u)\'^ dx+ I F{u + 5u)dx 
2 Jn Jn 

- ^ I |V(^ + 5</))|2dx+ / (iu + 6u)^ + b) (^ + 5(j))dx, 
Jn J a 



(27) 



which can be simplified to 

Eh{uh,^h) = - [ (|Vup + |VJu|2 + 2Vn- VJu) dx+ [ F{u)dx+ [ F'{u)5udx 
2 Jn Jn Jn 

+ - [ F"{u){6ufdx- — [ (|V(^|2 + |V^(/)|^ + 2V^- V5(/.) (ix+ [ {u^ + b)^dx 
2 Jn Jn Jn 



+ 2 u5u4>dx+ / {u + b)5(j)dx+ / {5uY(j)dx + 2 / u 6u 5(j)dx + 0{5u-^ ,6(l)'^ ,6u^6(l), 
Jn Jn Jn Jn 

(28) 



Since {u, (j), fi) satisfy the Euler-Lagrange equations of the functional (jl3p . we have 



A / Vu ■V6udx+ / F'{u)5udx-\-2 / u5u(j)dx = -2 fiu5udx, (29a) 
Jn Jn Jn Jn 



— [ V^-V6(t>+ [ {u^ + b)6^ = 0. 
4vr Jq Jn 



(29b) 
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Using equation (j28|) and the above Euler-Lagrange equations, we get 

Eh-E=- [ \V6u\'^ dyL-2fl [ u6udyL+- [ F"{u){5ufdyL 
J n J Q 2 J Q 

— I |V5<^|2dx+ / {5uf^d^ + 2 [ uSu5^dx + 0{6u^,6(f>^,6u^64>,6u6(l)'^ 



8tt 



n 



n 



n 



Note that the constraint functional in the discrete form, 

c{uh) = / u\d-iL - N , 
is also expanded about the solution n, to give 



c{uh) = {u + 5uf dx-N= {v? + {5uf + 2u5u) dx-N. 



Using 



17 



dy. = N . 



Q 



and c{uh) = we get 



/ {5uf = -2 [ 



2 I u 5udx 



(30) 
(31) 

(32) 

(33) 
(34) 



Using equations (|30p and (j34p we arrive at the following error bound in energy, upon 
neglecting third-order terms and beyond 



\Eh -E\ <- 



1 

+ 8^ 



iVJupdx 



{8uf dx 



1 

+ 2 



F"{u){5ufdx 



[ |V(50pdx 


+ 


/ {5uf^dx 


+ 2 


I u 5u 54> dx. 


Jn 




JQ 




Jq 



(35) 



□ 



Proposition 3.2. The finite- element approximation error in proposition \3.1\ expressed 
in terms of the approximation errors in root- electron- density and electrostatic potential 
is given by 

\Eh - E\ < C {\\ u - Uh Wl^fi +|(^ - (t>h\i,n+ \\u-Uh \\o,n\\ 4> - 4>h \\i,n) (36) 

Proof. We make use of the following norms: \-\i,n represents the semi-norm in space, 
II • lli^n denotes the norm, || • ||o,r2 and || • ||o,p,n denote the standard and 
norms respectively. All the constants to appear in the following estimates are positive 
and bounded. Firstly, we note that 



\Vdu\^dx 
{5u)'^ dx 



A 

2 



/ I VJtip dx < Ci|u — u/i|^ , 
Jn 

fL {uh- uf dx < C2 II n - Uh ||o,t7 . 
Jn 



(37) 
(38) 
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Using Cauchy-Schwartz and Sobolev inequalities, we arrive at the following estimate 
F"{u){6ufd^ < - [ \F"{u){uh-uf\ 



n 



< Cs II F"{u) llo.nll {u-uhf ||o,f 
= C3 II F"{u) \\o,n\\ u-uh ||o,4,n 

<C3\\U-Uh Win . 



Further, we note 



1 

8^ 



|V( 



n 



< Ca 



(39) 
(40) 



Using Cauchy-Schwartz and Sobolev inequalities we arrive at 



n 



< I \{uh-u) dyi <\\ (I) Wa^nW {uh - u) ||o,n 
n 



<C^\\U-Uh ||o,4,n 

<C^\\u-Uh ||?.n • 



(41) 



Also note that 



/ u 6u 6 


(j)dx 




J a 







\u{uh - u){(t)h 



)| 

< II u \\o,6,n\\ u-Uh ||o,n|| (t>-(t>h ||o,3,Q 

< W U - Uh ||o,n|| <^ - (t>h , 



(42) 



where we made use of the generalized Holder inequality in the first step and Sobolev 
inequality in the next. Using the bounds derived above, it follows that 



\Eh-E\ <C(\\u-Uh II? 0+1 



^h\i,n+ \\u-Uh \\o,n\\ <t>- <Ph \\i,n) ■ (43) 



□ 



Now it remains to bound the finite-element discretization error with interpolation 
errors which can in turn be bounded with size of the finite-element mesh size h. This re- 
quires a careful analysis in the case of orbital- free DFT, which is a non- linear constrained 
problem, and has been discussed in |31j . Using the results from Theorem 3.2 in |31j . we 
bound the estimates in equation ()43p using the following inequalities ( cf. 



u-Uh 111 o< Co II ti - ui 



\i,n 



< 



u-Uh ||o,n< Ci II n - m ||o,n< Ci ^ /i^"+V|fc+i,o^ , 

e 

\<t) - 0/i|l,C < C-2,\(t) - 0/|l,n < (52^ /le|(^|fc+l,Q^ , 



(44a) 
(44b) 
(44c) 
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where k is the order of the polynomial interpolation, and e denotes an element in the 
regular family of finite-elements [43] with mesh-size he covering a domain f^e- Hence, the 
error estimate in the energy is given by 

\Eh-E\<cY, [hf\u\l+i,n. + hfm+,^n^ + hf+'\u\k+i,nMk+i,n. 

e 

In the above equation and the analysis to follow, for simplicity, we restrict ourselves 
to the case where a single finite-element triangulation provides discretization for both 
root-electron-density and electrostatic potential. Finally, the error estimate to 0{h'^^^^) 
is given by 

\En-E\<cY^ hf [\n\l+^,n. + \4>\l+i,n^ ■ (46) 

e 



(45) 



3.2 Optimal Coarse-Graining Rate 



We now present the optimal mesh-size distribution that can be estimated by minimiz- 
ing the approximation error in energy for a fixed number of elements. The approach here 
closely follows the treatment in [36]. Using the definition of the semi-norms, we rewrite 
equation (j46]) as 



\Ej,-E\<CY,[hf 

e=l 



dx 



(47) 



where denotes the total number of elements in the finite-element triangulation. To 
obtain a continuous optimization problem rather than a discrete one, an element size 
distribution function /i(x) is introduced so that the target element size is defined at all 
points X in and we get 



\Eh-E\<CY, 



e=l 



<C' h 



X 



n 



D^+^u{x)\'^ + \D 
i?*^+iu(x)p + |Z)'=+i0(x)|2 



dx 



dx . 



Further, the number of elements in the mesh is in the order of 

dx 



n 



/i3(x) • 



(48) 
(49) 

(50) 



The optimal mesh size distribution is then determined by the following variational prob- 
lem which minimizes the approximation error in energy subject to a fixed number of 
elements: 



mm 

h 



|Z)'=+^n(x)|2 + |D'=+i(^(x)|2 



\ dx subiect to : [ , „ , , 



The Euler-Lagrange equation associated with the above problem is given by 



2kh 



2k-l, 



(x) \D''+^u{x)\'' + \D^+'^{x 



3rj 



0, 



Ne . (51) 



(52) 
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where rj is the Lagrange multipher associated with the constraint. Thus, we obtain the 
fohowing distribution 



where the constant A is computed from the constraint that the total number of elements 
in the finite element discretization is Nf,. 

The coarse-graining rate derived in equation (j53p has been employed to construct the 
finite-element meshes for different kinds of problems we study in the subsequent sections. 



In this section, we turn to the numerical implementation of the variational formulation 
described in Section [2] using the finite-element basis. We first review the finite-element 
basis used in our study and subsequently discuss the solution procedure. 

4.1 Finite- Element Basis 

Traditionally, linear tetrahedral elements have been the elements of choice for many ap- 
plications. These elements are well suited for problems involving complex geometries 
and requiring moderate levels of accuracy. However, in electronic structure calculations, 
geometries are simpler while the levels of accuracy desired are much higher. We investi- 
gate, in our study, if the desired chemical accuracy in electronic structure calculations can 
possibly be achieved more efficiently by using higher-order finite-element basis functions. 
We restrict our attention to a small, yet representative subset of existing finite-element 
basis functions. We employ in our study C*^ basis functions comprising of tetrahedral 
elements with interpolating polynomials up to degree two (TET4 and TETIO) and hex- 
ahedral elements up to degree four (HEX8, HEX27, HEX64, HEX125). The number 
following the words 'TET' and 'HEX' denote the number of nodes in the element. We 
also investigate the use of spectral-elements, first introduced in [37], and applied to elec- 
tronic structure calculations in [30]. In conventional finite-elements, basis functions are 
constructed as Lagrange polynomials interpolated through equi-spaced nodes in an ele- 
ment, whereas, spectral-elements use an optimal distribution of nodes corresponding to 
the roots of derivatives of Chebyshev or Legendre polynomials. Such a distribution does 
not have nodes on the boundaries of an element, and it is common to append nodes on 
element boundaries which guarantees basis functions. These set of nodes are com- 
monly referred to as Gauss-Lobatto-Chebyshev or Gauss-Lobatto-Legendre points. We 
further note that, for a high order of interpolation, conventional finite-elements result in 
an ill-conditioned problem, whereas, spectral-elements are devoid of this deficiency |44j . 
In this work, we use spectral-elements with basis functions of third and fourth degree 
having nodes positioned at the Gauss-Lobatto-Legendre points. 

In the studies to follow, we use Gauss quadrature rules to numerically evaluate the 
integrals arising in the formulation. We choose our numerical quadrature rules such that 
the error incurred through quadratures is higher order in comparison to discretization 




(53) 



4 Numerical Implementation 
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errors. Further, we restrict ourselves to a single finite-element discretization for all so- 
lution fields. We next present a brief description of the solution procedure and solvers 
employed in this work. 

4.2 Solution Procedure 

The analysis in appendix [X] shows that the incremental problem (j86p corresponding to 
a single step in Newton method is uniquely solvable in a staggered approach, and hence 
we employ the staggered solution procedure in our numerical studies. In order to pro- 
vide a good initial guess for the non-linear solvers based on quasi-Newton methods, 
we first use a non-linear conjugate gradient method implemented using Polak-Ribere 
scheme [45] to reduce the residuals corresponding to root-electron-density. This solu- 
tion is subsequently used as an initial guess to the inexact Newton solver provided by 
the KINSOL pacakage |46j that rapidly reduces the residuals in the vicinity of the solu- 
tion providing quadratic convergence. This is a Newton's Method with Goldstein- Armijo 
line searches [U]. The generalized-minimal residual method (GMRES |48j ) is used as 
the associated linear solver used in the inexact Newton solver, while bi-conjugate gra- 
dient stabilized method (BIGG, [l9]) and transpose-free quasi- minimal residual method 
(TFQMR, [50]) are alternate options for linear solvers. All these algorithms are based 
on Krylov subspace methods and hence require the evaluation of Hessian matrix only 
through its product with a vector, which is approximated by directional difference quo- 
tients. The solution of electrostatic potential (Poisson equation) and kernel potentials 
(Helmholtz equations) in the staggered procedure is computed using preconditioned linear 
conjugate gradient method or preconditioned GMRES. A suite of different precondition- 
ers provided by the HYPRE [51] package is used to improve the conditioning of the 
problem and computational efficiency. 

5 Numerical Results 
5.1 Rates of Convergence 

We now study the convergence rates of the finite-element approximation with decreasing 
mesh sizes for different polynomial orders of interpolation. We use three benchmark prob- 
lems in this study, which include: (i) Hydrogen atom that represents a linear problem 
with a singular nuclear potential; (ii) Helium atom that represents a non-linear prob- 
lem with a singular nuclear potential; (iii) Aluminum atom using pseudopotentials that 
represents a non-linear problem with a smooth external pseudopotential. In the studies 
which do not use pseudopotentials, the nuclear charges are treated as point charges lo- 
cated on the nodes of the finite-element triangulation, and the discretization provides a 
regularization for the nuclear charges. We note that the self-energy of the nuclei in this 
case is mesh-dependent and diverges upon mesh refinement. Thus, the self energy is also 
computed on the same mesh that is used to compute the total electrostatic potential, 
which ensures that the divergent components of the variational problem on the right 
hand side of equation (llOp and the self energy exactly cancel owing to the linearity of the 
Poisson equation. 
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We adopt the following procedure for conducting the convergence study. The coars- 
est mesh having elements is constructed with a coarsening rate determined by equa- 
tion ()53p using the a priori knowledge of the far-field asymptotic solutions of electronic 
fields. We note that as far-field asymptotic solutions are used in (I53p to compute the 
coarse-graining rates as opposed to the exact solutions, the obtained meshes are only op- 
timal in the far-field and possibly sub-optimal near the nuclei. Nevertheless, the meshes 
constructed using this approach still provide an efficient way of resolving the vacuum in 
non-periodic calculations as opposed to using a uniform discretization or employing an 
ad-hoc coarse-graining rate. The sequence of refined meshes are subsequently constructed 
by a uniform subdivison of the coarse-mesh, which represents a systematic refinement of 
the approximation space. The finite-element ground-state energies. Eh, obtained from 
each of these subdivisions for the highest order element (HEX125) are used to fit an 
expression of the form 

\Eo - Eh\ = C(l/iVe)'''/'. (54) 

to determine the constants Eq, C and k. All convergence study plots for different orders 
of finite-elements in the subsequent sub-sections show the relative error plotted 

against (l/Nf,)^/^ for the value of Eq obtained using the HEX125 element. The slopes of 
these curves provide the rate of convergence of the finite-element approximation error in 
energy for the system being studied. 



5.1.1 Hydrogen Atom 

We first begin with the simplest system to study, namely the Hydrogen atom. Note that 
the orbital-free DFT functional in the case of a single-electron system takes a simplified 
form due to the absence of electron-electron interactions. Further, the kinetic energy 
of the single-electron system can be represented exactly by the von-Weizsacker kinetic 
energy functional with A = 1, while the electrostatic energy is solely comprised of nuclear- 
electron interactions. An analytical solution for the root-electron-density for Hydrogen 
atom is known, and the radial solution is given by 

u{r) = -^Z— exp (— r) . (55) 

In this case, using the theoretical solution for u, the optimal mesh coarsening rate as 
determined by equation ([53ll is given by the following expression 

/ 2r 
h(r) = ^exp — 

Using this coarse-graining rate, we perform the numerical convergence study with both 
tetrahedral and hexahedral elements and the results are shown in figure [TJ In the present 
case, since the ground-state energy of Hydrogen is known theoretically, this theoretical 
value of -0.5 Hartree is used for Eq in computing the relative errors in energy. 
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Figure 1: Convergence rates for the finite-element approximation of a single Hy- 
drogen atom using orbital-free DFT. 

Results show that all the elements have close to optimal rate of convergence, 0{h?^), 
where k is the degree of the polynomial. It is interesting to note that optimal rates 
of convergence are obtained although the nuclear potential approaches a singular solu- 
tion (Coulomb potential) upon refinement. We note that, while both hexahedral and 
tetrahedral elements with same interpolation orders converge in error at similar rates, 
hexahedral elements provide better accuracies for a given mesh size (HEX8 and HEX27 
elements are more accurate than TET4 and TETIO elements respectively). Secondly, 
as the relative accuracy of calculations increase, elements with linear interpolation soon 
become untenable due to the large increase in number of elements for a small drop in 
error. For instance, the computation of energy of a single Hydrogen atom to a relative 
error of 10~^ required over a million linear TET4 elements in comparison to just a few 
thousand elements with quadratic TETIO elements. Further, relative errors up to 10~^ 
are achieved with just few hundreds of HEX64 and HEX125 elements, and even higher 
accuracies are achieved with few thousands of these elements. 



5.1.2 Helium Atom 

The next example we study is a single Helium atom, the simplest system displaying the 
full complexity of an all-electron calculation. The kinetic energy is treated using the TFW 
functional with A = |, and the exchange-correlation effects are treated using the LDA 
approximation. In order to determine the mesh coarse-graining rate for this problem, we 
note that the asymptotic solution for the electron-density in the TFW model is governed 
by an exponential decay from the nucleus [3] and is given by 



p{r) 



1 



■ exp 




for r — )• oo , 



(57) 
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where fl is the lagrange multipher associated with the constraint in the orbital-free DFT 
problem. Since the decay of electron-density is dominantly influenced by the exponential 
term, the root-electron-density u{r) for the purpose of determining the mesh coarse- 
graining rate is assumed to be 



u{r) 



2^3 1 

exp(— r) where C = 7: 

vr 2 




(58) 



The electrostatic potential is governed by the Poisson equation with total charge density 
being equal to the sum of u^{r) and 26{r) and is given by 



-2exp(-2er) U + 



1 



(59) 



Using the above equations, the mesh coarse-graining rate from equation (I53p is given by 



h{r) = A 



^^'=+5 exp (-2 C r) + 4 exp (-4 ^ r) 



vr 



fc+i 

n=0 



k + l\ 2"C"(A; + 1 



n 



n] 



r 



k-n+2 



(60) 

Note that /i which appears in the above equation is a parameter not known a priori. 
Hence, we find the value of fih on a coarse mesh initially and then use in the above 
equation to obtain h{r). Using this coarse-graining rate, we perform the numerical con- 
vergence study with both tetrahedral and hexahedral elements as explained before. Fig- 
ure [2] shows the convergence results for the various elements. As in the case of Hydrogen 
atom, we obtain close to optimal convergence rates although the governing equations are 
non-linear in nature and the nuclear potential approaches a singular solution upon refine- 
ment. Recent mathematical analysis [31] shows that the finite-element approximation for 
the orbital-free DFT problem does provide optimal rates of convergence, however, this 
analysis assumes that the nuclear potentials are smooth. To the best of our knowledge 
mathematical analysis of the finite-element approximation of the orbital-free DFT prob- 
lem with singular nuclear potentials is still an open question. The present numerical 
investigation provides evidence that optimal rates of convergence are achieved for the 
non-linear orbital- free DFT problem even with singular potentials. 



-l/(2fc+3) 
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Figure 2: Convergence rates for the finite-element approximation of a single Helium 
atom using orbital-free DFT. 



5.1.3 Pseudopotential Calculations 

We now turn to pseudopotential calculations in multi-electron systems. A pseudopoten- 
tial constitutes the effective potential of the nucleus and core electrons experienced by the 
valence electrons. In solid state physics, since the valence electrons are mainly responsible 
for chemical bonding and the chemical and physical properties of solids, it is common to 
compute material properties under the pseudopotential approximation. Further, as the 
electron-density corresponding to the valence electrons is closer to a free-electron gas, 
the model kinetic energy functionals in orbital-free DFT are better approximations for 
pseudopotential computations in comparison to all-electron calculations. 



Single Atom: The first example in this category is that of a single Aluminum atom. 
Here, we use the TFW kinetic energy functionals with A = -g, the LDA approximation 
for exchange correlation functional, and the Goodwin-Needs-Heine pseudopotential [52]. 
The mesh coarsening rate is derived in the same way as described in the case of Helium 
atom. Since the decay of the electron density is dominantly influenced by the exponential 
term, the root-electron-density u{r) for this problem is assumed to be of the form 



^i'"') — \l exp(— ,^ r) , where ^ = -\ ■ (61) 
V vr 2 V A 

The electrostatic potential is governed by the Poisson equation with total charge being 
equal to the sum of n^(r) and 36{r) and is given by 

^{r) = -3expi-2^r)U + ^Y (62) 
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Using the above equations, the mesh coarse-graining rate from equation (|53p is given by 



h{r) = A 



^2k+5 ^r)+9 exp (-4 ^ r) 



vr 



fc+i 

^k+22k+i _|_ 

?i=0 



/c + 1 
n 



n 



nk—n+2 



(63) 

Figure [3] shows the rates of convergence for the various elements considered, which exhibit 
close to optimal rates of convergence. The main observation that distinguishes this 
study from the all-electron study (Helium and Hydrogen atoms) is that all orders of 
interpolation provide much greater accuracies for pseudopotential calculations for the 
same number of elements. Linear basis functions are able to approximate the energy 
to two orders of greater accuracy. Ground state energies within relative errors of 10~^ 
can be achieved even with quadratic elements like TETIO and HEX27. The reason for 
this improved accuracy can be attributed to the smooth pseudopotential field in contrast 
to the singular nuclear potential field describing interactions between the nuclei and 
electrons. 
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Figure 3: Convergence rates for the finite-element approximation of a single Alu- 
minum atom using orbital-free DFT. 



Perfect crystal with periodic boundary conditions: The next example con- 
sidered is that of a perfect Aluminium crystal. Periodic boundary conditions are imposed 
on the root-electron-density and the electrostatic potential on a single face-centered cu- 
bic unit cell. The kinetic energy functional is modeled using kernel energies in this case. 
The details of the local reformulation with kernel energies have been discussed in Sec- 
tion [2j Figure S] shows the rate of convergence of the energy error using the density 
independent (DI) kernel functionals. We note that the convergence trend is similar to 
that of a single atom calculation with linear basis functions approximating the energy 
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to a better accuracy in comparison to an all-electron calculation for the same number of 
elements. Simulations have also been performed to demonstrate the optimal convergence 
rates using density dependent (DD) kernels and are shown in figure [5l 
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Figure 4: Convergence rates for the finite-element approximation of bulk Aluminum 
using orbital-free DFT with DI kernel energy. 
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Figure 5: Convergence rates for the finite-element approximation of bulk Aluminum 
using orbital-free DFT with DD kernel energy. 
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5.2 Computational Cost 

We now turn towards studying the computational efficiency afforded by tlie use of liigher- 
order finite-element approximations in the orbital-free DFT problem. As seen from the 
convergence studies, higher-order finite-element approximations result in a faster rate of 
convergence. However, the need for higher-order quadratures for numerical approxima- 
tion of the integrals in the finite-element formulation increases the per element cost of 
computations in using higher-order elements. Further, the increase in the bandwidth of 
the Hessian matrix affects the performance of the iterative solvers. In order to unambigu- 
ously determine the computational efficiency of higher-order elements, we measure the 
CPU time taken for the simulations conducted on benchmark problems for a wide range 
of meshes providing different relative accuracies. We also conduct this investigation on 
larger problems comprising of Aluminum clusters with varying sizes. All the simulations 
are conducted using meshes with the coarse-graining rates determined by equation (j53p . 
All the numerical simulations reported in this work are conducted using a parallel imple- 
mentation of the code based on MPI and were executed on a parallel computing cluster 
with following specifications: (2) dual-core AMD Opteron2218 2.6GHz processors per 
node, 8 GB memory per node and IGbps ethernet connectivity between nodes. The 
various single atom calculations were executed using 1 to 8 processors (cores), while the 
results for the larger aluminium clusters discussed subsequently were executed on 40 pro- 
cessors. It was verified that code scales linearly on this parallel computing architecture 
and hence the times reported here represent the total CPU time for the calculation, which 
is equivalent to the wall-clock time on a single processor. 

5.2.1 Single Atom Calculations 

A series of meshes for each element type are constructed using the optimal mesh coarse- 
graining rates for the benchmark systems considered (cf. equations (|56 p . (|60p and (j63p ). 
The number of elements are varied to obtain finite-element approximations with varying 
accuracies that target relative energy errors between 10~^ — 10~^. These meshes are 
used to solve the orbital-free DFT problem for the benchmark problems comprising of 
Hydrogen, Helium and Aluminum atoms. For the various finite-element interpolations 
considered in the present study, the relative energy error is plotted against the total CPU 
time for simulations on the series of meshes constructed. The CPU times are normalized 
with the longest time taken in the series of simulations for a given material system. 
Figures m [7] and [8] show these results for Hydrogen, Helium and Aluminum atom systems 
respectively. 

Our results indicate that higher-order interpolations become computationally more 
efficient as the desired accuracy of the computations increase, in particular for relative 
errors of the order of 10~^ and lower, which corresponds to chemical accuracy. Even 
at moderate accuracies, HEX8 elements are ten-fold computationally more efficient than 
TET4 elements. For relative errors of 10~^ — 10"'^, quadratic HEX27 elements per- 
form similar to cubic HEX64 elements, whereas quartic HEX125 elements take relatively 
longer times clearly showing the trade-off between accuracy and computational cost. Nev- 
ertheless, all higher-order elements are computationally more efficient than linear TET4 
elements. As, we examine the relative errors lower than 5 x 10~^, quartic HEX125 el- 
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ements offer a ten-fold computational savings in comparison to cubic HEX64 elements, 
while other low-order elements start deteriorating in terms of performance. An interest- 
ing observation is that the computational advantage offered by the spectral-elements is at 
least two-fold in the case of quartic HEX125Spectral element in comparison with quartic 
HEX125 element. We attribute the improved performance of the spectral-elements to 
the better conditioning of the discrete system of equations. Comparing the results across 
different materials systems, we observe that the performance of lower-order elements is 
far worse in the case of Hydrogen and Helium atom systems in comparison to Aluminum 
atom system. For instance, at a relative error of 10"'^, the solution time using TET4 is 
more than three orders of magnitude larger than HEX125Spectral elements for the case 
of Helium atom. However, the solution time is only two orders of magnitude larger for 
TET4 over HEX125Spectral elements for Aluminium atom. 

In summary, we note the following major observations from this investigation. Firstly, 
for a given level of accuracy, hexahedral elements converge faster than tetrahedral el- 
ements with the same order of interpolation. Secondly, for chemical accuracy corre- 
sponding to relative errors lower than 5 x 10~^, the computational efficiency improves 
significantly with the order of the element. Moreover, spectral-elements are computa- 
tionally more efficient in comparison to the conventional Lagrange elements of the same 
order. As further evidence, we note that the simulations on Hydrogen and Helium atoms 
show a staggering 100-1000 fold computational savings that are realized with the use 
of HEX125Spectral elements providing a relative error of 10~^ in comparison to linear 
TET4 elements resulting in a much higher relative error of around 10~^. Lastly, quali- 
tatively speaking, the sequence of graphs of relative error vs. normalized CPU time for 
the various elements tend towards increasing accuracy and computational efficiency with 
increasing order of finite-element interpolation. We also note that the point of dimin- 
ishing returns in terms of computational efficiency of higher-order elements for relative 
errors commensurate with chemical accuracy is beyond the fourth-order, and is yet to be 
determined. 
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Figure 6: Computational efficiency of various orders of finite-element approxima- 
tions. Case study: Hydrogen atom. 
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Figure 7: Computational efficiency of various orders of finite-element approxima- 
tions. Case study: Helium atom. 
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Figure 8: Computational efficiency of various orders of finite-element approxima- 
tions. Case study: Aluminum atom. 

5.2.2 Large Aluminum Clusters 

In order to further investigate the computational efficiency afforded by higher-order ele- 
ments in orbital- free DFT, we consider larger materials systems comprising of Aluminum 
clusters containing 1x1x1, 3x3x3, 5x5x5 face-centered cubic unit cells. We restrict 
this investigation to studying the computational efficiency of linear TET4 and quartic 
HEX125Spectral elements, which constitute the extreme ends of the spectrum of elements 
studied in this work. The finite-element meshes for these clusters were constructed to be 
uniform in the cluster region, while coarse-graining away from the cluster region. The 
mesh coarsening rate in the vacuum is determined by using in equation (j53p the asymp- 
totic solution of the far-field electronic fields, estimated as a superposition of single atom 
far- field asymptotic fields given by equation (j63p . Figure [9] shows the relative errors vs. 
the normalized CPU time for the various simulations conducted. The reference time 
taken here is that of a 3x3x3 aluminium cluster simulated using TET4 elements which 
took about 3000 CPU hours (approximately 75 hours on 40 AMD Opteron 2.6 GHz pro- 
cessors), and the relative error in the energy for this simulation is of the order 10~^. As 
can be observed from the results, quartic HEX125Spectral element is over hundred-fold 
more computationally efficient and provides at least an order of magnitude greater accu- 
racy than the linear TET4 element. Further, it is interesting to note that the simulation 
performed on a 5x5x5 aluminium cluster (666 atoms) using quartic HEX125Spectral ele- 
ments takes only marginally more time than a 1x1x1 cluster (14 atoms) discretized with 
linear TET4 elements, while providing two orders-of- magnitude greater accuracy. These 
results demonstrate that the computational efficiency of higher-order elements observed 



26 



Motamarri, Iyer, Knap, & Gavini 



in single atomic systems in section 15.2.11 also holds for larger materials systems with 
varying sizes, and corroborates the advantage of using higher-order spectral-elements in 
orbital-free DFT calculations. 



■ TET4 




Normalized Time 

Figure 9: Computational efficiency of various orders of finite-element approxima- 
tions. Case study: Aluminum clusters. 

6 Conclusions 

In the present study, we conduct a numerical analysis of the finite-element discretization 
of the orbital-free DFT problem with particular focus on evaluating the computational 
efficiency afforded by the use of higher-order finite-element discretizations. We used the 
saddle-point formulation of the orbital-free DFT problem proposed in [22] as our start- 
ing point. Since a higher-order finite-element discretization of the saddle-point problem 
allows for the possibility of using different orders of interpolation for various electronic 
fields, namely the potential fields and the root-electron-density, we investigated robust- 
ness of two possible solution schemes: (i) a simultaneous scheme where all the electronic 
fields are solved concurrently; (ii) a staggered approach, where the root-electron-density is 
solved using a quasi-Newton based iterative solver, while the potential fields are computed 
consistently for every trial root-electron-density. Our analysis of the solvability condi- 
tions indicated that, while the discrete system of equations in the staggered approach is 
invertible for any choice of interpolations of the electronic fields, the discrete equations 
in simultaneous scheme are not solvable if the dimension of the vector-space approxi- 
mating root-electron-density far exceeds that of electrostatic potential and the Hessian 
matrix corresponding to root-electron-density is rank deficient to a certain degree which 
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depends on the initial guess. This non-invertibility of the discrete set of equations corre- 
sponding to the simultaneous scheme is an inherent deficiency of saddle-point problems 
which cannot be surpassed. Thus, while the staggered scheme is robust, the simultaneous 
scheme is sensitive to the initial guess, an aspect which was also observed in our numerical 
simulations. 

In order to aid our studies on the computational efficiency of higher-order finite- 
elements in orbital- free DFT, we developed error estimates for the approximation error 
in energy in terms of the ground-state electronic fields and characteristic mesh-size. Using 
these error estimates and the a priori knowledge of the asymptotic solutions of far-field 
electronic fields, we constructed mesh coarse-graining rates for the various benchmark 
problems, which include single-atom systems comprising of Hydrogen, Helium and Alu- 
minum atoms. We first investigated the performance of higher-order elements by studying 
the convergence rates of various elements up to fourth-order including tetrahedral ele- 
ments, hexahedral elements and spectral finite-elements. In all cases, we observed close 
to optimal convergence rates. It is indeed worthwhile to note that optimal convergence 
is obtained although the orbital-free DFT problem is non-linear and the potential fields 
in Hydrogen and Helium atoms are singular. Having demonstrated the optimal con- 
vergence, we subsequently investigated the computational efficiency afforded by the use 
of higher-order elements. To this end, we used mesh coarse-graining rates determined 
from the proposed mesh adaption scheme and studied the CPU time required to solve 
the benchmark problems as well as larger systems containing up to 666 atom Aluminum 
clusters. Our results demonstrate that significant computational savings can be real- 
ized by using higher-order elements. For instance, a staggering 100-1000 fold savings in 
terms of CPU time and two orders of magnitude better accuracy are realized by using 
fourth-order hexahedral spectral-elements in comparison to linear tetrahedral elements. 

The prospect of using higher-order finite-elements as basis functions for electronic 
structure calculations is indeed promising. While finite-elements have the advantages 
of handling complex geometries and boundary conditions and scale well on massively 
parallel computing architectures, their use has been limited in electronic structure calcu- 
lations as they compare unfavorably to plane- wave and atomic-orbital basis functions due 
to the large number of basis functions required to reach chemical accuracy. The present 
study shows that the use of higher-order discretizations can alleviate this problem, and 
presents a useful direction for electronic structure calculations using finite-element dis- 
cretization. As part of the ongoing work, we are currently investigating the efficiency 
of these higher-order discretizations for the solution of Kohn-Sham DFT equations, and 
their performance in comparison to plane-wave and atomic-orbital basis functions. Fur- 
ther, the implications of using higher-order finite-element approximations in the recently 
proposed quasi-continuum reduction of orbital-free DFT [23j is a worthwhile subject for 
future investigation. 
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A Solvability of the Discrete FE Problem 

For the sake of clarity and simphcity we first present the solvability conditions for the 
discrete problem ([25]). Similar arguments can be used to extend the analysis in the 
case of kernel potentials (equation (I24j) ). If and xf represent the finite-dimensional 
subspaces with dimensions n„ and n^, the finite-element approximation for the field 
variables in problem (fT3]) is given by 



k=l 



(64) 
(65) 



k=l 



where A'^^ : 1 < /c < n„ is the basis of X^, : 1 < k < is the basis of Xf^, and u^, 
(j)k are the nodal variables associated with the root-electron-density and the electrostatic 
potential. Let 

L'{uh,(l)h,fJ.h) = L{uh,(l)h) + fJ-h{ uldx-N), (66) 

where L' {uh,4>h-: t^h) denotes the Lagrangian that imposes the constraint on the total 
number of electrons in the system via the Lagrange multiplier /i/j. The stationarity 
conditions associated with the discrete saddle-point problem in (I25p are given by 



dV 

dui 



{Uh,(ph,fJ'h) 







dV 



{uh) = . 



(67a) 
(67b) 
(67c) 



The above equations reduce to the following set of nonlinear equations: 



E[/ (2ViVfc(x)-VArr(x) + Arfc"(x)Arr(x)</),(x))dx]n,+ / -F'(n,)Arr(x) dx 



fT-u r /" 

+ //,^ / A^fc"(x)A^r(x)dx 

k=i L-^^ 



Uk = 



l...nu, 



(68a) 



E[i / VAr/(x)VAr,*(x)dx]0fc- / (n2(x) + 5(x, i?)) Ar/(x) = 



1 . . . n0 , 

(68b) 
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"'U "'U /» 

~1 r=l ''^ 



UrU^ — N = . 



(68c) 



where F{u) is given by 

F{u) = Cf ^ e^c{u'^)u^ ■ 

A.l Simultaneous Solution Procedure 

We now examine the solvabihty of the above system of nonUnear equations in a simultane- 
ous approach in which ah the fields are solved in an iterative scheme using quasi-Newton 
methods. Firstly, the discrete set of nonlinear equations are linearized about the n^^ 
iteration, and the resulting incremental problem corresponds to a single step within the 
iterative scheme for the nonlinear problem. We argue that if each iterative update is 
uniquely solvable, then the scheme is well behaved. Denoting {uf^\(t)"^\ ^^^'') to be 
the trial solution fields at the end of n*^ iteration, the linearizations of the stationarity 
conditions, neglecting second-order terms and beyond, are given by 



dL' 

dui 



{u 



(n+l) ,(n+l) , (n+1) 



h 



dV 



[U 



(„+l) Jn+l). 



du 



+ E 



dL' 



dui 



+ 



5u ' 

/ (n) i(n) (n)\ 3 

K A A ) 



dui dnh 



h 



dL' 



+ E 



d^L' 



. n,. 



(69a) 
i = 
(69b) 
(69c) 



1 



The corresponding matrix equation for the incremental solution at the (n + 1)*'' iteration 
is given by 

Kx=~f (70) 

where 





( A(«) 




A(n) \ 




( Su\ 




( \ 


K = 




c 


: J 


X = 




~f = 


gin) 













[ 5m ) 







(71) 



and the matrix elements 

d^L' 



A] 



(n) 



B. 



(n) 



dui 



'■h 



A 



(n) 



d^L' 



dui 



d^L' 



(72) 
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are given by 

A, = ^ [ V7Vr(x)ViV;(x)<ix + i [ F"K(x))iVr(x)Ar;(x)(ix 

^■^^ (73a) 

Bij= [ ^x;,(x)iVf(x)iV/(x)dx, (73b) 

Ai= [ ^z^(x)iVf(x)dx, (73c) 

Cij = -^ f '^Nf{^)VNf{^)d^. (73d) 
oTT Jn ■' 

The components of force vectors 

,(„) _ dL' 



fin) ^_dJJ_ 



(74) 



represent the forcing terms in the hnearized stationarity conditions in (|69p . 

The system matrices, A^'^\ B'^'^\ C, A^"-\ and vectors f^"''' , g^^"-' and h^^^ have the 
dimensions n„ x n„, n„ x n<^, x n,^, n„ x 1, and n„ x 1, x 1, 1 x 1 respectively. 
We observe that the matrix K is symmetric since A*-"-* and C are symmetric, but it is 
indefinite and this relates to the saddle-point property of the two-field formulation (j25p . 
Further, we note from equation ()73dp that C is the discrete analog of the negative Laplace 
operator, and is hence symmetric negative definite and invertible. Also, B^""^ is symmetric 
if function spaces and have the same basis. 

The primary requirement for the robustness of the iterative solution is that K must be 
non-singular at every iteration, which will guarantee a unique incremental solution of ()70|) 
and as the norm of the forcing vector ||/|| tends to zero, the only solution admitted by 
Kx = / is the vector. In order to study the invertibility conditions of equation (I70p . 
we first solve the second partition in K matrix for 5(f) to obtain 

64) = C-i^(") - C-^sW'^Jw , (75) 
and substitute in the first partition to get 

- b(")C-^B(")^) 6u + A(") 6fi = /^"^ - B^''^C-^g , (76a) 

A(")^5-u = . (76b) 

Dropping the superscript n in the expressions to follow for notational simplicity, the 
corresponding matrix equation is given by 

Kx = f, (77) 

where 

^ = ( Tr 1 ' ^=(71' / = I M ' (78) 
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and 

H = A - BC^B^ , f = f - BC^g. (79) 

We now derive the solvability conditions of the matrix K arising in our formulation. For 
a general discussion on the solvability and stability of mixed finite-element formulations, 
we refer to [32]. We begin by defining the kernel spaces of A"^, H and the column space 
of A in the following way 

Ker{A^) = {v £ M""|A^t> = 0}, 
Ker{H) = {v eW^'lHv = 0}, and 
R = {v e M'^^li? = pA for some p G M}. 

By definition, the dimension of i? is 1, and by using rank- nullity theorem on A^ we note 
that the dimension of i^er(A^) is n„ — 1. Further, we note that every vector q G i? is 
orthogonal to every vector v € Ker{A'^). Since dim{R) + dim{Ker{A^)) = Uu, we note 
that R denotes the subspace of M"" containing all vectors that are orthogonal to Ker{A'^). 

We now state the following propositions: 

Proposition A.l. Let r € Ker{A^), then the following statements are equivalent: 

(i) s^H r = Vs G A"er(A^) ^ r = 

(ii) K is invertihle 

Proof. We shall first show (i) =^ (ii) and then show ~ (i) =^ ~ (ii). Let (i) be true and 
let 

x = [ ^ \ satisfy Kx = Q, (80) 
\ m j 

i.e., we seek the solutions I and m satisfying the following equations 

Hl + Am = Q, (81) 
A^Z = . (82) 

It is clear that any vector I which solves the above system belongs to the space Ker{A^). 
Taking dot product with some vector s E Ker{A^) on both sides of the equation (jSip . 
we get 

s^H I + ms^A = ^ s^H I + m{A^sf = ^ H 1 = 0. (83) 

Since s^H Z = is true for any s G Ker{A^), (i) implies that Z = is the only solution 
possible to ()83p . Since A is a column vector, m = follows from equation ()8ip . Thus, 
Kx = admits a; = as the only solution, and therefore it follows that K is invertible. 

Next let us suppose that (i) is not true, which implies that there exists a non-zero 
vector Z G Ker{A^) satisfying s^H I = Vs E Ker{A^). This in turn implies that the 
vector HI lies in R, the column space of A. Thus, one can find m such that Hl+Am = 0, 
and since Z G Ker{A^), it follows that A^Z = 0. Hence, a non-zero vector x = (l m) 
satisfies Kx = from which it follows that K is not invertible. Thus, it follows that the 
statements (i) and (ii) are equivalent. □ 
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Remark. If statement (i) in proposition \A.1\ is true, then Ker{H) n Ker^A?") = 0. 
Further, if H is either symmetric positive semi-definite or negative semi- definite, then 
the statement (i) is equivalent to Ker{H) D Ker{A^) = 0. 

We now turn to examine the restrictions imposed by the simultaneous approach on the 
choice of the interpolation spaces and xt. 



Proposition A. 2. If K is invertihle, rank(A) > — {n^ + 1) 

Proof. We use Proposition lA.ll to prove this. Since K is invertible, we have for any 
r G Ker{A'^) and s^H r = Vs G Ker{A'^) admits r = as the only solution. From 
Proposition lA.ll Ker{H) D Ker{A^) = and using the fact that the dim{Ker{A^)) 
is Uu — 1, it follows that dim{Ker{H)) < 1. Hence, the rank{H) > — 1. Letting 
C = —C, we have 

n„ - 1 < rank{H) < rank{A) + rank{BC~^ B^) (84) 

Using the fact that rank of {BC ^B'^) never exceeds the rank of (C^^), it follows that 
rank{BC B'^) < Ufp. Subsequently, using equation we conclude rank{A) > 

riu - {n^ + 1). □ 

Remark. We note that the nature of matrix A is completely unknown at any given 
iteration, and may not he full-rank, since it depends on the initial guess of root- electron- 
density and electrostatic potential. Further, if the finite- element discretizations are chosen 
such that riu is exceedingly larger than n^, it is possible that n^ — (n(^+l) exceeds rank{A), 
and the system of equations will not he invertihle. If the finite- element discretization is 
chosen such that riu < n^, the condition rank{A) > n„ — {n^ + 1) is always satisfied. 
But, we note that this condition is only a necessary condition for K to he invertible. The 
sufficiency condition for invertibility of K, given by (i) in Proposition \A. 11 is difficult to 
check for any general discretization and guess of the electronic fields. 

Next we analyze the staggered approach of solving the discrete orbital-free DFT problem. 



A. 2 Staggered Solution Procedure 

We now examine the solvability of the discrete system of equations in (I68p using a 
staggered solution scheme. In this approach, we note that the stationarity condition 
^{uh, (j)h) = represents the discrete version of the Poisson equation, and solve the 
linear system of equations ()68bp to obtain the electrostatic potential consistently for a 
given root-electron-density. Thus, the root-electron-density and the Lagrange multiplier 
are solved in an iterative procedure using quasi-Newton methods, while the electrostatic 
potential is computed consistently for every root-electron-density using linear iterative 
solvers. The stationarity conditions corresponding to the staggered solution procedure 
are given by 

dL' 

-^iuh,(phiuh),fJ-h) = i = l...nu, (85a) 

OUi 

dL' 

T—{uh) = 0, (85b) 
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which can be expressed as the fohowing matrix equation for the incremental problem 





(86) 



Consider the discrete form of the lagrangian in the orbital-free DFT problem written in 
the following way 

A 



: inf Cf 



[ (n,(x))W3dx+^ / |Vn^(x)|2dx 



,{[uh{i^)f){uh{-y^)fdi^ + ^h{uh{-K)) + iih[ / {uh{i^)fdi^-N 



(87) 



where 



^h{Uh{^)) 



inf 



)f + b)M^)dx 



It follows from Proposition 3.3 of [31] that the discrete problem (j87p has a uniform 
minimizer in the neighborhood of a local minimizer of the continuous functional L'{u) 
for sufficiently small h, the characteristic length of the mesh. Assuming that L'{uh) is 
locally convex in a neighborhood of the minimizer, the positive definiteness of the matrix 
A' follows, which in turn guarantees invertibility of the system matrix. We also note that 
the staggered solution scheme does not impose any restrictions on and n^. Further, 
in our numerical studies, we find the staggered solution procedure is more robust in 
comparison to the simultaneous approach, both in terms of numerical convergence of the 
scheme and CPU time. 



B Estimate of Energy Error with Kernels 

Let (u/i, 4>hi Uhi '^h''^ ^h') '^°'^) w^-^) be the solutions of the discrete finite 

element problem ([24]) and the continuous problem ()22l) respectively for a given set of 
nuclear positions. Kernel functionals are valid for periodic systems where ground-state 
electron-density u is a perturbation of uniform electron gas. Hence, we assume u is 
bounded from above and below in subsequent analysis. The ground state energy in the 
discrete and the continuous formulations can be expressed as 



m „ 



n 



1 



V'u/.p + F{uh) 



1 



Stt 



J=l 



^ \-i-OLJ I ^ -otj-dj I -P] -'la I -«7 -I'P I \ - 



\y4>h\ + {uh + b)(^h\ d^ 



ix}, 

(89) 



and 
E(u, 



|Vu|2 + F{u) 



1 

8^ 



|V^|2 + (n2 + 5)0 



dx 



J=i 



n 



1 

AjBj 



dxj, 

(90) 
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where 

F{u) = Cf + e^c{u^)u^. 

We begin by expanding Eh{uh,(j)h,Ld'^-' ,u}^'^) about the solution of the continuous prob- 
lem, i.e Ufi = u + 5u, 4>h 
series expansion, we get 



+ 06, oj' 



+ 6u}°J, cuf-^ = CjI^J + 5ujt^J. Using Taylor 



j + HP + F{u + 5u) - ^|V(<^ + <5</.)|2 + [{u + + 6] [(^ + . 



+ 



^{[ajBj 

j=i 



J 



(91) 



Since (n, (/>, /i, w"-^, o;^-^) satisfy the Euler-Lagrange equations of the functional we 
have 

„ m 
+ 2 (a + /?) 5u ^dx = - j 2flu 6u dx 



a 



An 



V(/> • V6(f) + (u^ + b)6(p 



dx = 0, 
dx = 0, 
dx = 0. 



Using equation (I9T]1 and the above Euler-Lagrange equations we get 

Eh-E= I |-|V(5up - 2jiu 5u + -F"{u){5uf - —\V5ct)\^ + {5uf^ + 2u 5u dcf) 
Jq^2 2 Svr 



Aj 

AjBj A J 



1 1 

+ V{— 5a;^-' + — — V5a;"^ • V5uj'^-' + 2a{2a- u^"-^ 



(5n) 



AjBj 



(92) 



2 a ^2 "-ijn Jw'^'^ + 2 /3(2 /3 - l)w°^ u^/^-a + 2^u^^-^6u Su'^^ 
+ 2 (a + /3)(2 (a + /3) - 1) Aj u^i^+P)-^i'^''^ 



,{5u) 



.}}dx. 
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Using equations p4p and (|92p . we arrive at the following bound in energy 



\Eh -E\< 



1 



\V6u\'^dx 



+ 



1 



F"{u){5ufdx 



n 
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+ 



1 

8^ 



|V(50p(ix 



+ 



{Suf'cj) dx 



n 



+ 2 
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-V5uj°'-' ■ V5ujl^-' dx 



qAjBj 

2au^'^'^5u5uj^-' dx 



u 6u 54> dx 
/ 2a{2a-l)uj^' VL 

J Q 



+ 

J=l 
I3j „-.2a-2 (^^ 



Aj 



Suj''-' 5uj^-' dx 



■ dx 
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Jn ^ Jn 



+ 



[ 2(a + /3)(2(Q + /3) - l)Aju 



2{a+/3)-2 



(Su) 



■ dx 



(93) 

We next find an optimal bound for the kernel terms involved in \Eh — E\. As before, 
let I • represents the semi-norm in space, || • represents the norm, || • ||o,n 
represents the standard norm, || • ||o,p,r2 represents norm. Firstly note that using 
Holder inequality, we have 

[ -^5u;°J 6uj^-' dx ^ 
Jn Aj 



Aj 
<Ci 



(^w"^ 5uj^-' dx 



Q 



a 



and 



AjBj 



VSuj""' ■ V6u}^' dx 



1 



dx < Ci II 



_ ,7,° J 



llo,^^ II ^ - llo.n 
(94) 



AjBj 

<C2 [ 



V6uj°'' ■ VSoo^' dx 



dx 



(95) 



< C2 II v(^"^ -L^°-o iio,n II iio,Q 

= C72|cD°^-w°^|i,o|c^^^-c^f |i,n. 



Using Holder inequality and the Sobolev inequality, we arrive at 



Jn 



/3j .^2a-2 



■ dx 



n 



dx 



<C3 II W^^ti2a-2 ii^^^ii ^u-Uhf llo.f 

= Cs II o'^-' tt^"-^ ||o,n|| u-uh ||o,4,n 

<C3\\U-Uh 



(96) 



Further, note that 

2au^''~^6u6oj'^' dx 



n 



< Ca 



n 



dx 



< C4, II n^" ^ l|o,6,n|| U — Uh \\o,Q\\ W"'-' — LJ'f^-' \\0,3,Q 

< C4 W u - Uh \\o,n\\ w^-^ - ^^h' \\i,Q ■ 



(97) 
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where we made use of generalized Holder inequality in the first step and Sobolev inequality 
in the next. Also one can show that 



2/3(2/3-l)cD-'^n2/3-2(M!dx 



<C5 II cD"-^n2'3-2 llo^^ll {u-Uh)^ ||o,f 
= C5 II O"-' n^^"^ ||o,q|| u-uh ||o,4,n 
<C5\\u-Uh Win 



(98) 



and 



<Cfi 



<C6\\u-Uh \\o,n\\ uj"-^ - ujj^-' ||i,f7 



(99) 



Using the bounds derived above, it follows that 

\Eh - E\ <c\\\u-Uh II^Q +10 - 4)h\\,n+ \\u-Uh ||o,n| 



'->h Win 



+ 



^||| w"^ llo.nll ^ 



- 4' \\o,n +|c^"^ - \(^^' - 4' 



j=i 



+ \\u-uh \\o,n\\ "^""^ - 4' h,^ + \\u-Uh \\o,n\\ uj^-' - ^ 



II \ 
h lli.f^/ 



(100) 



Now in the neighbourhood of the solution {u,(j), fL,u}°'-^ jCo^-'), we bound the above 
estimates with the interpolation error and then bound them with finite-element mesh 
size in the similar lines of Section [31 



e 

^"'-4' \\o,n<Co\\ O'^'-O'}-' ||o,n<C'o5]/ie+V"1fc+i,f^e, 



1"^ „ _ 
L^-^ - uj'f^" \\o,n< Co II u)^^ - 4' llo,n< Co^h),- ^jw-^ |fc+i,n. 



(101a) 
(101b) 
(101c) 
(lOld) 



Hence the error estimate in energy is given by 



1^;, - ^1 < C [hf\u\l_,^,^^ + /if |<^|^+i,n^ + /if +^|nU+i,Oe|0U+iA 



m 

+ Z^\"e l'^ \k+l,n^^ \k+l,n^ + llf. |W ■'|fe+l,Qjw'^-^|fc+l,Q^ 



J=l 



+ /jf+lUll.,,. I,7,/3.| 



fc+l,ne + /if """"^hlfc+l, 



!^el'^°"'|fc+l 



(102) 
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The error estimate to 0{h^^^^) is therefore given by 



j=i 



.}] (103) 
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